Method and apparatus for generating and applying security labels to sensitive data

ABSTRACT

The disclosure comprises a method, an apparatus, and instructions for controlling a computer to implement a security labeling service (SLS) to tag an electronic record or data stream with security labels to ensure compliance with access restriction requirements. The SLS tags a record or data stream with security labels according to constraints including jurisdictional (government regulation), organizational policy, and authorization of a subject of record (e.g. patient consent). The SLS consumes a vocabulary dictionary to interpret the record and the constraints to generate rules for tagging the data. The original record or data stream is then tagged according to the rules. The tagged output is used to ensure compliance with the security labels.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. Ser. No. 14/293,651, filed Jun. 2, 2014, and claims the benefit of U.S. Provisional Patent Application No. 61/831,059, filed on Jun. 4, 2013, in the United States Patent and Trademark Office, the disclosure of which is incorporated herein by reference.

BACKGROUND

1. Field

Example embodiments relate to a method, a computer-readable medium and an apparatus that may apply security labels to sensitive data based on authorization of a subject of record, organizational policies, and jurisdictional policies.

2. Description of the Related Art

Dissemination of sensitive data requires controlled, conditional access according to different levels or types of authorization. Examples of sensitive data include medical records, classified documents (e.g. DOD top secret), and the like. Access to such records may need to be limited according to both the sensitive nature of particular records and the purpose for requesting the records. For example, if a medical record contains data relating to, e.g., HIV or psychiatric treatment, current Federal laws prohibit unrestricted distribution of such records, allowing access only in specific circumstances.

Enforcement of data handling requirements can become a problem upon records being transferred from an original custodian source. Once records are transferred, proper enforcement of data handling agreements or obligations may or may not occur. For example, medical records could be exchanged between service provider organizations. Once the records are transferred, it becomes difficult to police the handling of the data.

Solutions have been proposed based on maintaining separate databases for different versions of data, each with different levels of access privileges. One such example involves compiling the databases and subsequently providing access to the appropriate version of the data according to access rights. However, this method poses a new problem whenever access rights change. For example, if jurisdictional laws or regulations change to require different access rights to sensitive data, the entire database would need to be recompiled each time this occurs. This problem renders this particular method expensive, and potentially introduces down time when the databases are being recompiled.

Accordingly, there is a need for a solution that avoids maintaining separate databases and recompiling databases upon changes in access rights.

SUMMARY

Accordingly, one or more embodiments provide a method to tag at runtime an electronic record or an electronic data stream with a security label that enables automated compliance and enforcement with each of a subject of record authorization, an organizational policy, and a government regulation.

According to another aspect of an exemplary embodiment, the method may receive a retrieval request to retrieve an electronic record associated with a subject of record. The method may further determine a rule, by a computer, for tagging the electronic record based on a vocabulary dictionary, an authorization constraint, an organizational policy constraint, and a government rule constraint, where the authorization constraint is provided by the subject of record. The method may further retrieve the electronic record from a repository. The method may further decompose the electronic record into a decomposed data source. The method may further tag the electronic record at runtime with a security label according to the determined rule and the decomposed data source. The method may further transmit the tagged electronic record.

According to another aspect of an example embodiment, a Security Labeling Service (SLS) may be provided to tag the electronic record or electronic data stream. The SLS may comprise a rule generation service, an extraction engine, a rules engine, and a transformation engine. The rule generation service may generate rules using a vocabulary dictionary, decision considerations, and rule constraints. The extraction engine may decompose the electronic record or data stream into a decomposed data source. The rules engine may output a directive based on the rules from the rule generation service, the decomposed data from the extraction engine, and rule languages. The transformation engine may tag the electronic record or data stream using the directives from the rules engine.

Additional aspects, features, and/or advantages of exemplary embodiments will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the disclosure.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart illustrating tagging data with security labels according to an example embodiment.

FIG. 2 is a block diagram of a security labeling service for tagging data with security labels according to an example embodiment.

FIG. 3 illustrates the rule generation service portion of the security labeling service in greater detail according to an example embodiment.

FIGS. 4A and 4B are a sequence diagram illustrating order of operation of the security labeling service according to an example embodiment.

FIG. 5 is a flow chart illustrating a process for automatic compliance with security labels according to an example embodiment.

FIG. 6 is a block diagram of a system implementing the security labeling service according to an example embodiment.

DETAILED DESCRIPTION

The foregoing and/or other aspects are achieved by providing a Security Labeling Service (SLS) to tag, at runtime, an electronic record or an electronic data stream with a security label. The security label can be applied to any sensitive data which requires access restriction. The use of the security label can thereby ensure proper handling of sensitive data after the data leaves the custodian source.

FIG. 1 is a flow chart illustrating tagging of data. Sensitive data 101 and access rights 102 are input into the SLS 110. The sensitive data 101 may be in one or more formats expressed according to one or more languages or standards. For example, the sensitive data 101 may be expressed in binary format or some human readable format such as XML. Further, when expressed in, e.g., XML, the sensitive data 101 may be in a specific format requiring a standard use of specific XML tags. However, embodiments are not limited to any particular format.

The SLS 110 interprets the access rights 102 in light of the sensitive data 101 to determine whether any access restrictions apply. The access restrictions may include, for example, redaction, restriction or encryption requirements. However, the scope of access restrictions is not limited to these examples. If access restrictions from the access rights are applicable to the sensitive data 101, the sensitive data 101 is tagged with security labels by the SLS 110 and output as tagged data 107. The tagged data is then used to ensure automated compliance with access rights 102 by enforcement 120.

FIG. 2 illustrates an example embodiment of the SLS. The SLS 110 may include a rule generation service 201, an extraction engine 202, a rules engine 203 and a transformation engine 204.

The rule generation service 201 may generate rules 215 based on the access rights 102. The rule generation service 201 outputs rules 215 to the rules engine 203. The rules 215 are a formatted set of access rights 102 which may be expressed in one or more formats that may be recognized by the rules engine 203.

The extraction engine 202 receives an original data source 250. The original data source corresponds to the sensitive data 101, but may be expressed as either an electronic record or an electronic data stream. The extraction engine 202 may have, for example, two basic functions. First, the extraction engine may indicate to the rule generation service what type of electronic records or data stream will be tagged with security labels. For instance, medical records may be stored as a patient's problem list from an HL7 CDA document. The second basic function of the extraction engine is to consume the original data source and decompose the original data source 250 into decomposed data 230 in a standard form which can be interpreted by the rules engine 203. The decomposed data 230 may occur in XML form, for example. However, the scope of the present invention is not limited to XML data. The decomposed data 230 may occur in any form which is convenient for the specific implementation as is well known in the art.
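
By way of non-limiting illustration, the decomposition function of the extraction engine might be sketched in Python as follows. The record layout, the function name decompose, and the flat dictionary form of the decomposed data are assumptions made only for this sketch and are not part of the disclosure.

```python
import xml.etree.ElementTree as ET

# Hypothetical original data source; element names are illustrative only.
RECORD = """<Problem>
  <category>
    <coding>
      <code value="111880001"/>
      <display value="Diagnosis"/>
    </coding>
  </category>
</Problem>"""


def decompose(xml_text):
    """Flatten every <coding> element into a dict of child tag -> value attribute."""
    root = ET.fromstring(xml_text)
    facts = []
    for coding in root.iter("coding"):
        facts.append({child.tag: child.get("value") for child in coding})
    return facts


FACTS = decompose(RECORD)
```

In this sketch each dictionary plays the role of one fact in the decomposed data, which a rules engine could then match against its rules.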

The rules engine 203 may receive the decomposed data 230, rules 215, and rule languages 231. The rules 215 may occur in a variety of formats which can be interpreted according to the rule languages 231. The rules engine 203 generates an array or list of required labeling instructions according to the rules 215 which are interpreted according to the rule languages 231 and the decomposed data 230. For example, if only a subset of the rules 215 is applicable to the decomposed data 230, only that subset of the rules 215 is translated to the array or list of required labeling instructions. The array or list of required labeling instructions is then output as directives 233 to the transformation engine 204.
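
As a minimal sketch of this filtering step, assuming rules and facts are both keyed by a code system and a code, the selection of applicable rules might look like the following Python. The Rule and Directive types, the run_rules function, and the label strings are hypothetical names introduced for illustration; an actual embodiment may use a full rules engine instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    code_system: str
    code: str
    labels: tuple  # security labels to apply when the rule matches


@dataclass(frozen=True)
class Directive:
    code: str
    labels: tuple


def run_rules(rules, facts):
    """Return one directive per (fact, rule) pair whose code system and code match."""
    directives = []
    for fact in facts:
        for rule in rules:
            if (rule.code_system, rule.code) == (fact["code_system"], fact["code"]):
                directives.append(Directive(fact["code"], rule.labels))
    return directives


SNOMED = "2.16.840.1.113883.6.96"
RULES = [
    Rule(SNOMED, "111880001", ("RESTRICTED",)),
    Rule(SNOMED, "66214007", ("RESTRICTED", "ENCRYPT")),
]
FACTS = [{"code_system": SNOMED, "code": "111880001"}]

# Only the first rule applies to the single fact, so one directive results.
DIRECTIVES = run_rules(RULES, FACTS)
```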

The transformation engine 204 uses the directives 233 output from the rules engine to apply the security label to the original data source 250. The transformation engine 204 then outputs the tagged data 107 as a tagged electronic record or electronic data stream which enables automated compliance and enforcement of the constraints of the subject of record authorization, organizational policy, and government regulation.
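
One possible way to picture the transformation step, assuming the original data source is XML and the directives map a code value to a list of label strings, is the Python sketch below. The apply_directives function, the simplified securitylabel layout, and the sample record are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical original data source; element names are illustrative only.
RECORD = """<Problem>
  <category>
    <coding>
      <code value="111880001"/>
    </coding>
  </category>
</Problem>"""


def apply_directives(xml_text, directives):
    """Append a <securitylabel> to each <category> whose code has a directive."""
    root = ET.fromstring(xml_text)
    # Snapshot the matches first so the tree is not mutated while iterating.
    for category in list(root.iter("category")):
        code_el = category.find("coding/code")
        if code_el is None:
            continue
        labels = directives.get(code_el.get("value"), [])
        if not labels:
            continue
        sec = ET.SubElement(category, "securitylabel")
        for text in labels:
            ET.SubElement(sec, "label").text = text
    return ET.tostring(root, encoding="unicode")


TAGGED = apply_directives(RECORD, {"111880001": ["RESTRICTED", "ENCRYPT"]})
```

The original content is left unchanged in this sketch; the labels are only added alongside it, so a downstream system can decide how to enforce them.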

FIG. 3 illustrates the rule generation service 201 in greater detail according to an example embodiment. The rule generation service 201 receives a vocabulary dictionary 310, decision considerations 311 and rule constraints 312. The rule constraints 312 represent the access rights 102 expressed in a format which can be interpreted by the rule generation service. The vocabulary dictionary may include vocabulary that can be used to interpret the rule constraints 312. The vocabulary dictionary 310 may include standards generally recognized by an industry. For example, as illustrated in FIG. 3 for the case of medical records, the vocabulary dictionary may include SNOMED CT, RxNorm, ICD, and HL7 terminology. However, this is merely an example, and the vocabulary dictionary may include other standards and languages without departing from the spirit of the present invention.

The rule constraints 312 are a set of access rights which are expressed in a format which can be recognized by the SLS 110. The rule constraints 312 may include, for example, authorization from a subject of record. For example, the authorization from a subject of record could be patient consent for release of medical records that would otherwise not be disclosed. The subject of record authorization may be conditional so that access is permitted only upon satisfying conditions specified by the subject of record.

The rule constraints 312 may further include organizational policy and a government regulation. The government regulation could be, for example, U.S. Privacy Law 42 CFR part 2 which governs medical record privacy. The organizational policy may be additional constraints by an organization which handles the sensitive data, for example, a health care service provider, medical record custodian, or other HIPAA covered entity or business associate. The additional constraints provided by the organizational policy may be, for example, whether to encrypt, mask, or redact records when the records are in transit or at rest.

The rules 215 may be expressed in one or more languages. Also, the rules represent all of the rule constraints 312 without regard to relevance of any sensitive data 101. Therefore, the rules 215 may or may not be applicable to the sensitive data 101 at this stage. Accordingly, as previously indicated, the rules engine 203 outputs a directive 233 which contains only relevant rules.

The decision considerations 311 are a set of conditions which can modify access rights 102 based on the output rules 215. The decision considerations 311 may include, for example, a purpose for request of the sensitive data and also workflow considerations. Taking medical records, again, as an example, the purpose for request may be for treatment, or more specifically, for emergency treatment. Emergency treatment, as opposed to routine treatment such as a physical examination, may allow access to records which would otherwise be redacted from the physician's access. As an alternative example, if the sensitive data 101 is classified information classified as DOD “secret”, then the purpose for request may be for designing electronic hardware to meet a specification regarding signal-to-noise ratio of a radio frequency section. In this case, data access may be granted that would otherwise be redacted.
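
The effect of a purpose for request on an otherwise applicable restriction might be sketched as a lookup of overrides, as in the following Python. The override table, the effective_action function, and the action name PERMIT are assumptions made for illustration; the purpose codes TREAT and ETREAT are borrowed from the example rules given later in this description.

```python
# Hypothetical override table: (base action, purpose for request) -> new action.
# Here emergency treatment lifts a redaction that routine treatment would keep.
OVERRIDES = {("REDACT", "ETREAT"): "PERMIT"}


def effective_action(base_action, purpose, overrides=OVERRIDES):
    """Return the handling action after applying any purpose-based override."""
    return overrides.get((base_action, purpose), base_action)


routine = effective_action("REDACT", "TREAT")     # no override applies
emergency = effective_action("REDACT", "ETREAT")  # override applies
```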

FIGS. 4A and 4B illustrate an example sequence of operations of the SLS 110 in accordance with an exemplary embodiment. The sequence starts at the top of FIG. 4A and ends at the bottom of FIG. 4B which is a continuation of FIG. 4A. The sequence diagram of FIGS. 4A and 4B illustrates the interaction between the SLS 110 components (rule generation service 201, extraction engine 202, rules engine 203, and transformation engine 204), which are depicted horizontally across the top of FIG. 4A. FIGS. 4A and 4B use medical records, again, only as a non-limiting example.

The sequence begins with operation 401 which is a valid request, external to the SLS, for medical records subject to access control. The next two operations involve the rule generation service 201. In operation 402, the SLS 110 retrieves the jurisdictional and policy rules from the rule generation service 201. In operation 403, the rule generation service 201 responds to the request by retrieving the rule constraints 312 and generating the annotation rules as rules 215. Note the dotted line, which illustrates where the annotation rules are next used, as elaborated below.

The next two operations involve the extraction engine 202. In operation 404, the extraction engine 202 decomposes the original data source 250 which is returned in operation 405 as decomposed data 230 which, in the non-limiting example of FIGS. 4A and 4B, includes clinical facts, e.g. a diagnosis containing sensitive information requiring restricted access.

The next five operations involve the rules engine. In operation 406, the annotation rules (or rules 215) are input via the dotted line discussed above, and used to create, in operation 407, a knowledge session in the rules engine 203.

In operation 408, the clinical facts in the decomposed data 230 (from the second dotted line) are asserted in the rules engine 203, and in operation 409, the rules engine is executed to generate the directives 233. In operation 410, the directives are generated.

The next four operations involve the transformation engine. In operation 411, the transformation engine 204 applies the directives 233 to generate, in operation 412, the original data source tagged with security labels. In operation 413, any handling instructions are asserted. The transformation engine responds, in operation 414, by generating the packaged document for transmission.

In operation 420 of FIG. 4B, the tagged data is output from the SLS.

FIG. 5 illustrates automated compliance with the security labels after data is tagged by the SLS. Note that the illustrations of FIG. 5 do not involve the SLS 110, but rather occur outside the SLS. FIG. 5 represents details of the enforcement 120 of FIG. 1. The tagged data 107, after being tagged with security labels by the SLS 110, is able to ensure automated compliance and enforcement of handling according to the applied security labels when transmitted from a data custodian to any receiving system. Automated compliance may be cooperatively performed in part by the data custodian and in part by the receiving system after the data is tagged with security labels by the SLS 110.

In order for automated compliance to occur, either the data custodian or the receiving system must guarantee the automated compliance. Therefore, before the data is transmitted, the data custodian must determine in operation 501 whether the receiving system is configured to automatically comply with the security labels. At operation 502, a decision is made whether to proceed to operation 503a or 503b. If the receiving system is configured to automatically comply with the security labels, then the tagged data may be sent to the receiving system according to operation 503a. Otherwise, the data custodian must guarantee compliance instead according to operation 503b. For instance, if a record must be redacted, then the data custodian will redact the record according to the tagged security label of tagged data 107 before sending.
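
The branch at operation 502 might be sketched as follows in Python, assuming a tagged record is represented as a dictionary with a body and a list of labels. The function names, the record shape, and the "[REDACTED]" placeholder are hypothetical and serve only to illustrate the two paths.

```python
def redact_if_labeled(record):
    """Custodian-side enforcement: blank the body when a REDACT label is present."""
    if "REDACT" in record["labels"]:
        return {**record, "body": "[REDACTED]"}
    return record


def prepare_for_transmission(record, receiver_complies, enforce=redact_if_labeled):
    """Choose between the two paths of operation 502.

    If the receiving system honors the labels, send the tagged data as is;
    otherwise the data custodian enforces the labels before sending.
    """
    if receiver_complies:
        return record
    return enforce(record)


TAGGED = {"body": "Acute HIV", "labels": ["REDACT"]}
sent_to_compliant = prepare_for_transmission(TAGGED, receiver_complies=True)
sent_to_other = prepare_for_transmission(TAGGED, receiver_complies=False)
```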

Depending on the rule constraints 312 and the transmission medium used to transmit the tagged data 107 from the data custodian to the receiving system, the data may need to be encrypted before and/or after transmission. A decision regarding encryption is performed in operation 504. If encryption is required, the tagged electronic data stream or records will be encrypted in operation 505 before transmission. Encryption may be required according to the rule constraints 312 if transmission occurs, e.g. via the Internet. Encryption might not be required if transmission occurs, e.g. via a local area network (LAN). Encryption requirements will be indicated in the tagged security label.
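
A minimal sketch of the decision at operation 504 is given below in Python. It assumes, purely for illustration, that the requirement is signaled by an ENCRYPT label and that a configurable set of transmission media is exempt; the function name encryption_required and the exempt set are not taken from the disclosure.

```python
def encryption_required(labels, medium, exempt_media=frozenset({"LAN"})):
    """Return True when the security label demands encryption for this medium."""
    return "ENCRYPT" in labels and medium not in exempt_media


over_internet = encryption_required(["RESTRICTED", "ENCRYPT"], "Internet")
over_lan = encryption_required(["RESTRICTED", "ENCRYPT"], "LAN")
unlabeled = encryption_required(["RESTRICTED"], "Internet")
```

Whether a given medium is exempt would in practice follow from the rule constraints rather than from a fixed default as in this sketch.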

After automated compliance with the security labels is guaranteed, the data may be transmitted in operation 506 from the data custodian to the receiving party.

FIG. 6 illustrates an exemplary system which incorporates the SLS 110. In this non-limiting example, the data custodian 601 is one health care provider, and the receiving party 602 is another health care provider. The receiving system is accessed by users 610 to request medical records from the data custodian 601. After authentication by the security authentication system 605, the users 610 may request records.

The data custodian accesses the healthcare classification system 604 which hosts the SLS 110. Note that the hosting of the SLS 110 by the healthcare classification system is only a non-limiting example shown for illustrative purposes. Also in this non-limiting example, the organizational policy is retrieved from a policy engine hosted on the security authentication system 605 for use in the SLS 110.

The data custodian 601 uses the SLS 110 to transform the original data source 250 into tagged data 107. After ensuring automated compliance, for example, by the illustration of FIG. 5, the data custodian transmits the packaged data 603 to the receiving system for use by the users 610.

The following is an example of input and output data to the SLS 110 in XML format, particularly pertaining to medical records. This is an example of a base rule for a clinical fact pertaining to a medical record, specifically relating to HIV infection which must be restricted according to Federal privacy laws:

<ClinicalTaggingRule code="11450-4" codeSystem="2.16.840.1.113883.5.25" codeSystemName="LOINC" displayName="Problem List">
  <ClinicalFact code="111880001" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="Acute HIV"/>
  <ActReason code="ETREAT" codeSystem="2.16.840.1.113883.5.8" codeSystemName="urn:hl7-org:v3"/>
  <ActInformationSensitivityPolicy code="HIV" codeSystem="2.16.840.1.113883.1.11.20429" codeSystemName="urn:hl7-org:v3"/>
  <ImpliedConfidentiality>
    <Confidentiality code="R" codeSystem="2.16.840.1.113883.5.25" codeSystemName="urn:hl7-org:v3"/>
  </ImpliedConfidentiality>
</ClinicalTaggingRule>

The following is an example of another rule, relating to substance abuse, which happens to be expressed in the Red Hat Drools rule language:

rule "Clinical Rule Substance abuse (disorder) REDACT"
dialect "mvel"
when
  $xacml : XacmlResult(subjectPurposeOfUse == "TREAT",
      eval(pdpObligations.contains(
          "urn:oasis:names:tc:xspa:2.0:recource:patient:redact:ETH")))
  $cd : ClinicalFact(codeSystem == "2.16.840.1.113883.6.96",
      code == "66214007",
      c32SectionLoincCode == "11450-4")
then
  ruleExecutionContainer.addExecutionResponse(new RuleExecutionResponse(
      "ETH", "REDACT", (String)Confidentiality.R, "42CRFPart2", "ENCRYPT",
      "NORDSLCD", $cd.c32SectionTitle, $cd.c32SectionLoincCode,
      $cd.observationId, $cd.code, $cd.displayName, "SNOMED CT"))
end

The SLS 110 is able to process either of the above exemplary rules because the rules engine 203 uses the rule languages 231 to decode rules 215 which may occur in a plurality of formats and languages.

This is an example of an original data source 250 occurring in XML format, particularly a diagnosis of Acute HIV infection:

<?xml version="1.0" encoding="UTF-8"?>
<Problem xmlns="http://hl17.org/fhir">
  <text>
    <status value="generated"/>
    <div xmlns="http://www.w3.org/1999/xhtml">Acute HIV (Date: 21-Nov 2012)</div>
  </text>
  <subject>
    <type value="patient"/>
    <reference value="patient/@example"/>
  </subject>
  <code>
    <text value="Acute HIV"/>
  </code>
  <category>
    <coding>
      <system value="http://snomed.infor"/>
      <code value="111880001"/>
      <display value="Diagnosis"/>
    </coding>
  </category>
  <status value="confirmed"/>
  <onsetDate value="2012-11-12"/>
</Problem>

The SLS 110, according to the above example embodiments, determines that the first of the two above rules is relevant. Accordingly, the SLS 110 tags the above original data source 250 with security labels according to the relevant rule after the rules engine 203 transforms it into a directive 233. The transformation engine 204 outputs:

<?xml version="1.0" encoding="UTF-8"?>
<Problem xmlns="http://hl17.org/fhir">
  <text>
    <status value="generated"/>
    <div xmlns="http://www.w3.org/1999/xhtml">Acute HIV (Date: 21-Nov 2012)</div>
  </text>
  <subject>
    <type value="patient"/>
    <reference value="patient/@example"/>
  </subject>
  <code>
    <text value="Acute HIV"/>
  </code>
  <category>
    <coding>
      <system value="http://snomed.infor"/>
      <code value="111880001"/>
      <display value="Diagnosis"/>
    </coding>
    <securitylabel>
      <category>
        <term>http://hl7.org/security/ . . . /term/restricted</term>
        <label>RESTRICTED</label>
        <scheme>http://hl7.org/security/ . . . /confidentiality</scheme>
      </category>
      <category>
        <term>http://hl7.org/security/ . . . /term/NORDSLCD</term>
        <label>No Redisclosure with Consent</label>
        <scheme>http://hl7.org/security/ . . . /RefrainPolicy</scheme>
      </category>
      <category>
        <term>http://hl7.org/security/ . . . /term/ENCRYPT</term>
        <label>ENCRYPT</label>
        <scheme>http://hl7.org/security/ . . . /Obligations</scheme>
      </category>
    </securitylabel>
  </category>
  <status value="confirmed"/>
  <onsetDate value="2012-11-12"/>
</Problem>

Although many of the foregoing examples specifically involve medical records, the scope of the invention is not limited to medical records. A person having ordinary skill in the art will appreciate that the SLS and other aspects of the present invention can be equally applied to many other forms of sensitive data which require controlled access. As previously indicated, another non-limiting example is the use of the SLS to control access to DOD records, e.g. secret or top secret records. For instance, access may be redacted for a user who has a sufficiently high clearance, but whose purpose for request does not require access to a specific record. Also, access may be fully restricted for a user without sufficiently high clearance. Again, these are only non-limiting examples. Further, the present invention is not limited to any particular choice of computer language or format such as the non-limiting examples above which happen to be expressed in XML.

The SLS 110 and other aspects of the foregoing may be implemented in one or more processors or other hardware such as, but not limited to, either a programmable gate array or an ASIC. The SLS processing method according to the above-described embodiments of the present invention may be further recorded in non-transitory computer readable media including instructions for controlling one or more processors or other hardware to implement the SLS processing method. The media may include, alone or in combination with the controlling instructions, data files, data structures, etc. Non-limiting examples of the non-transitory computer readable media include magnetic media such as hard disks, floppy disks, magnetic tape, etc.; optical media such as CD ROM or DVD disks; magneto-optical media such as floptical disks; and hardware devices that are specifically configured to store and perform program instructions, including but not limited to read only memory (ROM), battery backed up random access memory (RAM), flash memory, etc. Examples of instructions for controlling one or more processors or other hardware include machine code such as produced by a compiler, code in a medium or high level language which may either be compiled (such as C or C++) or executed by a computer using an interpreter (such as Java). The various components of the SLS, such as the rule generation service 201, extraction engine 202, rules engine 203, and transformation engine 204, may each be separately implemented in dedicated hardware, or may be implemented together in a single hardware device. For example, the rule generation service 201, extraction engine 202, rules engine 203, and transformation engine 204 may be implemented in at least one computer, or, alternatively, may be implemented in at least one FPGA or ASIC.

Although a few embodiments of the present invention have been shown and described, it would be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the claims and their respective equivalents.

What is claimed is:
 1. A method comprising: tagging at runtime, by a computer, an electronic record or an electronic data stream with a security label that enables automated compliance and enforcement with each of a subject of record authorization, an organizational policy, and a government regulation.